    Near-optimal experimental design for model selection in systems biology

    Motivation: Biological systems are understood through iterations of modeling and experimentation. Not all experiments, however, are equally valuable for predictive modeling. This study introduces an efficient method for experimental design aimed at selecting dynamical models from data. Motivated by biological applications, the method enables the design of crucial experiments: it determines a highly informative selection of measurement readouts and time points. Results: We demonstrate formal guarantees of design efficiency on the basis of previous results. By reducing our task to the setting of graphical models, we prove that the method finds a near-optimal design selection with a polynomial number of evaluations. Moreover, the method exhibits the best polynomial-complexity constant approximation factor, unless P = NP. We measure the performance of the method in comparison with established alternatives, such as ensemble non-centrality, on example models of different complexity. Efficient design accelerates the loop between modeling and experimentation: it enables the inference of complex mechanisms, such as those controlling central metabolic operation. Availability: The toolbox ‘NearOED’ is available with source code under the GPL on the Machine Learning Open Source Software website (mloss.org). Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
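
    The guarantees described above (near-optimality with a polynomial number of evaluations, and the best constant approximation factor unless P = NP) are characteristic of greedy maximization of a monotone submodular design criterion. The sketch below is a generic illustration of that greedy recipe, not the NearOED toolbox API; the score function, candidate readouts and time points are all placeholders.

        # Hedged sketch (not the NearOED API): greedy selection of
        # (readout, time point) pairs under a fixed design budget. When the
        # scoring function is monotone submodular, the greedy design is
        # within a (1 - 1/e) factor of the optimum.
        from itertools import product

        def greedy_design(readouts, time_points, score, budget):
            """Greedily pick `budget` (readout, time) pairs maximising `score`.

            `score(design)` is a user-supplied placeholder returning a scalar
            informativeness value for a list of (readout, time) pairs.
            """
            candidates = set(product(readouts, time_points))
            design = []
            for _ in range(budget):
                best = max(candidates, key=lambda c: score(design + [c]))
                design.append(best)
                candidates.remove(best)
            return design

        # Toy usage with a made-up additive score:
        weights = {("mRNA", 1): 0.2, ("mRNA", 5): 0.9,
                   ("protein", 1): 0.5, ("protein", 5): 0.4}
        print(greedy_design(["mRNA", "protein"], [1, 5],
                            lambda d: sum(weights[c] for c in d), budget=2))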

    A combined model reduction algorithm for controlled biochemical systems

    Background: Systems Biology continues to produce increasingly large models of complex biochemical reaction networks. In applications requiring, for example, parameter estimation, the use of agent-based modelling approaches, or real-time simulation, this growing model complexity can present a significant hurdle. Often, however, not all portions of a model are of equal interest in a given setting. In such situations methods of model reduction offer one possible approach for addressing the issue of complexity by seeking to eliminate those portions of a pathway that can be shown to have the least effect upon the properties of interest. Methods: In this paper a model reduction algorithm bringing together the complementary aspects of proper lumping and empirical balanced truncation is presented. Additional contributions include the development of a criterion for selecting state variables for elimination via conservation analysis and the use of an ‘averaged’ lumping inverse. This combined algorithm is highly automatable and of particular applicability in the context of ‘controlled’ biochemical networks. Results: The algorithm is demonstrated here via application to two examples; an 11-dimensional model of bacterial chemotaxis in Escherichia coli and a 99-dimensional model of extracellular signal-regulated kinase (ERK) activation mediated via the epidermal growth factor (EGF) and nerve growth factor (NGF) receptor pathways. In the case of the chemotaxis model the algorithm was able to reduce the model to 2 state-variables, producing a maximal relative error between the dynamics of the original and reduced models of only 2.8% whilst yielding a 26-fold speed up in simulation time. For the ERK activation model the algorithm was able to reduce the system to 7 state-variables, incurring a maximal relative error of 4.8%, and producing an approximately 10-fold speed up in the rate of simulation. Indices of controllability and observability are additionally developed and demonstrated throughout the paper. These provide insight into the relative importance of individual reactants in mediating a biochemical system’s input-output response, even for highly complex networks. Conclusions: Through application, this paper demonstrates that combined model reduction methods can produce a significant simplification of complex Systems Biology models whilst retaining a high degree of predictive accuracy. In particular, it is shown that by combining the methods of proper lumping and empirical balanced truncation it is often possible to produce more accurate reductions than can be obtained by the use of either method in isolation.
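
    As an illustration of one ingredient of the combined algorithm, the sketch below performs plain balanced truncation of a stable linear (or linearised) state-space system via its Gramians; the proper-lumping step and the chemotaxis/ERK models themselves are not reproduced, and the three-state example system is invented.

        # Hedged sketch: balanced truncation of a stable linear system
        # dx/dt = A x + B u, y = C x, via its controllability and
        # observability Gramians (square-root algorithm). The 3-state
        # example below is invented for illustration only.
        import numpy as np
        from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

        def balanced_truncation(A, B, C, r):
            # Gramians: A Wc + Wc A^T = -B B^T,  A^T Wo + Wo A = -C^T C
            Wc = solve_continuous_lyapunov(A, -B @ B.T)
            Wo = solve_continuous_lyapunov(A.T, -C.T @ C)
            Lc = cholesky(Wc, lower=True)
            Lo = cholesky(Wo, lower=True)
            U, s, Vt = svd(Lo.T @ Lc)                  # Hankel singular values s
            T = Lc @ Vt.T @ np.diag(s ** -0.5)         # balancing transformation
            Tinv = np.diag(s ** -0.5) @ U.T @ Lo.T
            Ab, Bb, Cb = Tinv @ A @ T, Tinv @ B, C @ T
            return Ab[:r, :r], Bb[:r, :], Cb[:, :r], s  # keep r dominant states

        A = np.array([[-1.0, 0.0, 0.0],
                      [ 1.0, -2.0, 0.0],
                      [ 0.0, 1.0, -3.0]])   # toy linear 'reaction chain'
        B = np.array([[1.0], [0.0], [0.0]]) # input enters the first species
        C = np.array([[0.0, 0.0, 1.0]])     # the last species is measured
        Ar, Br, Cr, hsv = balanced_truncation(A, B, C, r=2)
        print("Hankel singular values:", hsv)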

    Mathematical and Statistical Techniques for Systems Medicine: The Wnt Signaling Pathway as a Case Study

    The last decade has seen an explosion in models that describe phenomena in systems medicine. Such models are especially useful for studying signaling pathways, such as the Wnt pathway. In this chapter we use the Wnt pathway to showcase current mathematical and statistical techniques that enable modelers to gain insight into (models of) gene regulation, and generate testable predictions. We introduce a range of modeling frameworks, but focus on ordinary differential equation (ODE) models since they remain the most widely used approach in systems biology and medicine and continue to offer great potential. We present methods for the analysis of a single model, comprising applications of standard dynamical-systems approaches such as nondimensionalization, steady-state, asymptotic and sensitivity analysis, and more recent statistical and algebraic approaches to compare models with data. We present parameter estimation and model comparison techniques, focusing on Bayesian analysis and coplanarity via algebraic geometry. Our intention is that this (non-exhaustive) review may serve as a useful starting point for the analysis of models in systems medicine. Comment: Submitted to 'Systems Medicine' as a book chapter.
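
    As a minimal, self-contained illustration of the Bayesian parameter-estimation workflow discussed in the chapter, the sketch below fits an invented two-parameter production/decay ODE to synthetic data with a random-walk Metropolis sampler; it is not a Wnt pathway model, and the prior, noise level and proposal scale are arbitrary choices.

        # Hedged sketch: random-walk Metropolis sampling of the posterior of a
        # toy production/decay ODE, dx/dt = k_prod - k_deg * x, fitted to
        # synthetic data.
        import numpy as np
        from scipy.integrate import solve_ivp

        def simulate(theta, t_obs):
            k_prod, k_deg = theta
            sol = solve_ivp(lambda t, x: k_prod - k_deg * x, (0.0, t_obs[-1]),
                            [0.0], t_eval=t_obs)
            return sol.y[0]

        def log_posterior(theta, t_obs, y_obs, sigma=0.1):
            if np.any(theta <= 0):                 # flat prior on (0, inf)^2
                return -np.inf
            resid = y_obs - simulate(theta, t_obs)
            return -0.5 * np.sum((resid / sigma) ** 2)

        rng = np.random.default_rng(0)
        t_obs = np.linspace(0.5, 10.0, 20)
        y_obs = simulate((2.0, 0.5), t_obs) + rng.normal(0, 0.1, t_obs.size)

        theta = np.array([1.0, 1.0])
        lp = log_posterior(theta, t_obs, y_obs)
        samples = []
        for _ in range(5000):                      # Metropolis iterations
            prop = theta + rng.normal(0, 0.05, size=2)
            lp_prop = log_posterior(prop, t_obs, y_obs)
            if np.log(rng.random()) < lp_prop - lp:
                theta, lp = prop, lp_prop
            samples.append(theta.copy())
        print("posterior mean (k_prod, k_deg):", np.mean(samples[1000:], axis=0))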

    Methods of model reduction for large-scale biological systems: a survey of current methods and trends

    Complex models of biochemical reaction systems have become increasingly common in the systems biology literature. The complexity of such models can present a number of obstacles for their practical use, often making problems difficult to intuit or computationally intractable. Methods of model reduction can be employed to alleviate the issue of complexity by seeking to eliminate those portions of a reaction network that have little or no effect upon the outcomes of interest, hence yielding simplified systems that retain an accurate predictive capacity. This review paper seeks to provide a brief overview of a range of such methods and their application in the context of biochemical reaction network models. To achieve this, we provide a brief mathematical account of the main methods, including timescale exploitation approaches, reduction via sensitivity analysis, optimisation methods, lumping, and singular value decomposition-based approaches. Methods are reviewed in the context of large-scale systems biology models, and future areas of research are briefly discussed.
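
    As a concrete example of the SVD-based family surveyed here, the sketch below applies a proper-orthogonal-decomposition projection to simulated trajectories of an invented three-species linear system; a real biochemical network would supply the snapshot matrix instead.

        # Hedged sketch: proper orthogonal decomposition (an SVD-based
        # reduction) of trajectories of an invented three-species linear
        # system.
        import numpy as np
        from scipy.integrate import solve_ivp

        A = np.array([[-1.0, 0.5, 0.0],
                      [ 0.5, -2.0, 0.5],
                      [ 0.0, 0.5, -3.0]])               # toy rate matrix
        x0 = np.array([1.0, 0.0, 0.0])
        t = np.linspace(0.0, 10.0, 200)
        snapshots = solve_ivp(lambda s, x: A @ x, (0, 10), x0, t_eval=t).y

        U, sv, _ = np.linalg.svd(snapshots, full_matrices=False)
        Phi = U[:, :2]                                  # two dominant POD modes
        A_red = Phi.T @ A @ Phi                         # projected 2x2 dynamics
        z = solve_ivp(lambda s, x: A_red @ x, (0, 10), Phi.T @ x0, t_eval=t).y
        err = np.max(np.abs(Phi @ z - snapshots))
        print("singular values:", sv, " max reconstruction error:", err)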

    Complex genetic patterns in human arise from a simple range-expansion model over continental landmasses

    Although it is generally accepted that geography is a major factor shaping human genetic differentiation, it is still disputed how much of this differentiation is a result of a simple process of isolation-by-distance, and whether there are factors generating distinct clusters of genetic similarity. We address this question using a geographically explicit simulation framework coupled with an Approximate Bayesian Computation approach. Based on only six simple summary statistics, we estimated the most probable demographic parameters shaping modern human evolution under an isolation-by-distance scenario: an initial population in East Africa spread and grew from 4000 individuals to 5.7 million in about 132 000 years. Subsequent simulations with these estimates followed by cluster analyses produced results nearly identical to those obtained in real data. Thus, a simple diffusion model from East Africa explains a large portion of the genetic diversity patterns observed in modern humans. We argue that a model of isolation by distance along the continental landmasses might be the relevant null model to use when investigating selective effects in humans and probably many other species.
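
    The inference machinery referred to above is, at its core, rejection-based approximate Bayesian computation: draw parameters from the prior, simulate, and keep draws whose summary statistics fall close to the observed ones. The sketch below shows that recipe with a deliberately trivial stand-in simulator and a single summary statistic; it does not reproduce the spatially explicit range-expansion model or its six statistics.

        # Hedged sketch: rejection ABC with a trivial stand-in simulator. The
        # 'observed' statistic, priors and tolerance are invented; they only
        # illustrate the accept/reject mechanics.
        import numpy as np

        rng = np.random.default_rng(1)

        def summary(n0, growth):
            # Toy simulator: log final size after noisy exponential growth.
            return np.log(n0) + growth + rng.normal(0, 0.1)

        observed = np.log(5.7e6)                    # target summary statistic
        accepted = []
        for _ in range(100_000):
            n0 = rng.uniform(1e3, 1e4)              # prior on founding size
            growth = rng.uniform(0.0, 10.0)         # prior on total log-growth
            if abs(summary(n0, growth) - observed) < 0.05:
                accepted.append((n0, growth))
        post = np.array(accepted)
        print(len(post), "accepted; posterior means:", post.mean(axis=0))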

    A framework for parameter estimation and model selection from experimental data in systems biology using approximate Bayesian computation.

    As modeling becomes a more widespread practice in the life sciences and biomedical sciences, researchers need reliable tools to calibrate models against ever more complex and detailed data. Here we present an approximate Bayesian computation (ABC) framework and software environment, ABC-SysBio, which is a Python package that runs on Linux and Mac OS X systems and enables parameter estimation and model selection in the Bayesian formalism by using sequential Monte Carlo (SMC) approaches. We outline the underlying rationale, discuss the computational and practical issues and provide detailed guidance as to how the important tasks of parameter inference and model selection can be performed in practice. Unlike other available packages, ABC-SysBio is particularly well suited to the challenging problem of fitting stochastic models to data. In order to demonstrate the use of ABC-SysBio, in this protocol we postulate the existence of an imaginary reaction network composed of seven interrelated biological reactions (involving a specific mRNA, the protein it encodes and a post-translationally modified version of the protein), a network that is defined by two files containing 'observed' data that we provide as supplementary information. In the first part of the PROCEDURE, ABC-SysBio is used to infer the parameters of this system, whereas in the second part we use ABC-SysBio's relevant functionality to discriminate between two different reaction network models, one of them being the 'true' one. Although the approach is computationally expensive, the additional insights gained in the Bayesian formalism more than make up for this cost, especially in complex problems.
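
    ABC-SysBio performs parameter inference and model selection with ABC-SMC; the sketch below is a much-simplified rejection-sampling version of the model-selection idea only (sample a model index and parameters from their priors, accept simulations close to the data, and read approximate posterior model probabilities off the acceptance fractions). It is not the ABC-SysBio API, and the two candidate decay models and the 'observed' data are invented.

        # Hedged sketch: rejection-ABC model selection between two invented
        # decay models (first- vs second-order). Not the ABC-SysBio API or
        # its SMC sampler; the 'observed' data are synthetic.
        import numpy as np
        from scipy.integrate import solve_ivp

        rng = np.random.default_rng(2)
        t_obs = np.linspace(0.0, 5.0, 10)

        def simulate(model, k):
            rhs = (lambda t, x: -k * x) if model == 0 else (lambda t, x: -k * x**2)
            return solve_ivp(rhs, (0.0, 5.0), [1.0], t_eval=t_obs).y[0]

        y_obs = simulate(0, 0.8) + rng.normal(0, 0.02, t_obs.size)  # truth: model 0

        counts = [0, 0]
        for _ in range(5000):
            m = int(rng.integers(2))            # uniform prior over models
            k = rng.uniform(0.0, 2.0)           # prior on the rate constant
            if np.sqrt(np.mean((simulate(m, k) - y_obs) ** 2)) < 0.05:
                counts[m] += 1
        print("accepted per model:", counts,
              " P(model 0 | data) ~", counts[0] / max(sum(counts), 1))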

    Importance of incomplete lineage sorting and introgression in the origin of shared genetic variation between two closely related pines with overlapping distributions

    Genetic variation shared between closely related species may be due to retention of ancestral polymorphisms because of incomplete lineage sorting (ILS) and/or introgression following secondary contact. It is challenging to distinguish ILS and introgression because they generate similar patterns of shared genetic diversity, but this is nonetheless essential for accurately inferring the history of species with overlapping distributions. To address this issue, we sequenced 33 independent intron loci across the genome of two closely related pine species (Pinus massoniana Lamb. and Pinus hwangshanensis Hsia) from Southeast China. Population structure analyses revealed that the species showed slightly more admixture in parapatric populations than in allopatric populations. Levels of interspecific differentiation were lower in parapatry than in allopatry. Approximate Bayesian computation suggested that the most likely speciation scenario explaining this pattern was a long period of isolation followed by secondary contact. Ecological niche modeling suggested that a gradual range expansion of P. hwangshanensis during the Pleistocene climatic oscillations could have been the cause of the overlap. Our study therefore suggests that secondary introgression, rather than ILS, explains most of the shared nuclear genomic variation between these two species and demonstrates the complementarity of population genetics and ecological niche modeling in understanding gene flow history. Finally, we discuss the importance of contrasting results from markers with different dynamics of migration, namely nuclear, chloroplast and mitochondrial DNA.

    Model selection in historical research using approximate Bayesian computation

    Formal Models and History: Computational models are increasingly being used to study historical dynamics. This new trend, which could be named Model-Based History, makes use of recently published datasets and innovative quantitative methods to improve our understanding of past societies based on their written sources. The extensive use of formal models allows historians to reevaluate hypotheses formulated decades ago and still subject to debate due to the lack of an adequate quantitative framework. The initiative has the potential to transform the discipline if it solves the challenges posed by the study of historical dynamics. These difficulties are based on the complexities of modelling social interaction, and the methodological issues raised by the evaluation of formal models against data with low sample size, high variance and strong fragmentation. This work examines an alternate approach to this evaluation based on a Bayesian-inspired model selection method. The validity of Lanchester's classical laws of combat is examined against a dataset comprising over a thousand battles spanning 300 years. Four variations of the basic equations are discussed, including the three most common formulations (linear, squared, and logarithmic) and a new variant introducing fatigue. Approximate Bayesian Computation is then used to infer both parameter values and model selection via Bayes factors. Results indicate decisive evidence favouring the new fatigue model. The interpretation of both parameter estimations and model selection provides new insights into the factors guiding the evolution of warfare. At a methodological level, the case study shows how model selection methods can be used to guide historical research through the comparison between existing hypotheses and empirical evidence. Funding for this work was provided by the SimulPast Consolider Ingenio project (CSD2010-00034) of the former Ministry for Science and Innovation of the Spanish Government and the European Research Council Advanced Grant EPNet (340828).
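
    For reference, the classical Lanchester square and linear laws compared above are small ODE systems; the sketch below integrates both to produce surviving strengths, the kind of simulator output one would feed into an ABC comparison. The coefficients, initial strengths and stopping rule are invented, and the fatigue variant and battle dataset from the paper are not reproduced.

        # Hedged sketch: the classical Lanchester square and linear laws as
        # ODE systems, integrated until one side is effectively annihilated.
        import numpy as np
        from scipy.integrate import solve_ivp

        def square_law(t, y, alpha, beta):
            a, b = y                    # dA/dt = -beta*B, dB/dt = -alpha*A
            return [-beta * b, -alpha * a]

        def linear_law(t, y, alpha, beta):
            a, b = y                    # attrition proportional to both strengths
            return [-beta * a * b, -alpha * a * b]

        def fight(law, a0, b0, alpha, beta, t_end=500.0):
            stop = lambda t, y, *args: min(y) - 1.0   # a side drops below 1 unit
            stop.terminal = True
            sol = solve_ivp(law, (0.0, t_end), [a0, b0],
                            args=(alpha, beta), events=stop)
            return sol.y[:, -1]         # surviving strengths (summary statistic)

        print("square law:", fight(square_law, 1000.0, 800.0, 0.01, 0.01))
        print("linear law:", fight(linear_law, 1000.0, 800.0, 1e-4, 1e-4))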

    Quantifying the roles of host movement and vector dispersal in the transmission of vector-borne diseases of livestock

    The role of host movement in the spread of vector-borne diseases of livestock has been little studied. Here we develop a mathematical framework that allows us to disentangle and quantify the roles of vector dispersal and livestock movement in transmission between farms. We apply this framework to outbreaks of bluetongue virus (BTV) and Schmallenberg virus (SBV) in Great Britain, both of which are spread by Culicoides biting midges and have recently emerged in northern Europe. For BTV we estimate parameters by fitting the model to outbreak data using approximate Bayesian computation, while for SBV we use previously derived estimates. We find that around 90% of transmission of BTV between farms is a result of vector dispersal, while for SBV this proportion is 98%. This difference is a consequence of higher vector competence and shorter duration of viraemia for SBV compared with BTV. For both viruses we estimate that the mean number of secondary infections per infected farm is greater than one for vector dispersal, but below one for livestock movements. Although livestock movements account for a small proportion of transmission and cannot sustain an outbreak on their own, they play an important role in establishing new foci of infection. However, the impact of restricting livestock movements on the spread of both viruses depends critically on assumptions made about the distances over which vector dispersal occurs. If vector dispersal occurs primarily at a local scale (99% of transmission occurs <25 km), movement restrictions are predicted to be effective at reducing spread, but if dispersal occurs frequently over longer distances (99% of transmission occurs <50 km), they are not.
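
    The decomposition described above can be sketched as a simple bookkeeping exercise: sum a distance-dependent infection probability over neighbouring farms for the vector route, add a per-movement probability for the livestock route, and compare the two contributions to the expected number of secondary infections. All farm locations, kernel parameters and probabilities below are invented, not the fitted BTV/SBV estimates.

        # Hedged sketch: splitting the expected secondary infections per
        # infected farm into a vector-dispersal part (distance kernel over
        # neighbouring farms) and a livestock-movement part.
        import numpy as np

        rng = np.random.default_rng(3)
        farms = rng.uniform(0.0, 100.0, size=(200, 2))   # farm locations (km)
        source = farms[0]

        # Vector route: per-farm infection probability decays with distance.
        dist = np.linalg.norm(farms[1:] - source, axis=1)
        p_vector = 0.5 * np.exp(-dist / 5.0)             # 5 km dispersal scale
        R_vector = p_vector.sum()

        # Movement route: a handful of outgoing livestock movements.
        n_moves = rng.poisson(3)                         # movements off the farm
        R_moves = n_moves * 0.1                          # infection prob. per move

        R_total = R_vector + R_moves
        print(f"R_vector={R_vector:.2f}  R_moves={R_moves:.2f}  "
              f"vector share={R_vector / R_total:.0%}")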
